
[DNM] feat(da): support fiber (not via c-node) #3244

Draft
julienrbrt wants to merge 37 commits into main from julien/fiber

Conversation

@julienrbrt
Member

@julienrbrt julienrbrt commented Apr 13, 2026

Overview

Supports the Fiber client (based on https://github.com/celestiaorg/celestia-app/blob/63fbf31cca216fc4e067a9e1b3a3431115c7009b/fibre), but not via celestia-node or apex for this PoC.
celestiaorg/celestia-node#4892

@coderabbitai
Contributor

coderabbitai Bot commented Apr 13, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 94adb153-a3c0-49fe-a05a-a24b17d355b6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.



@github-actions
Contributor

github-actions Bot commented Apr 13, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
|---|---|---|---|---|
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Apr 28, 2026, 3:05 PM |

@claude
Contributor

claude Bot commented Apr 13, 2026

Claude encountered an error — View job


I'll analyze this and get back to you.

@codecov

codecov Bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 90.65657% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.16%. Comparing base (2865d6d) to head (4485d91).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| block/public.go | 0.00% | 12 Missing ⚠️ |
| block/internal/da/fibremock/mock.go | 90.90% | 5 Missing and 5 partials ⚠️ |
| block/internal/da/fiber_client.go | 96.74% | 5 Missing and 3 partials ⚠️ |
| pkg/sequencers/solo/sequencer.go | 61.53% | 5 Missing ⚠️ |
| pkg/config/config.go | 75.00% | 2 Missing ⚠️ |
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #3244      +/-   ##
==========================================
+ Coverage   62.33%   63.16%   +0.82%
==========================================
  Files         122      124       +2
  Lines       12873    13258     +385
==========================================
+ Hits         8024     8374     +350
- Misses       3968     3995      +27
- Partials      881      889       +8
```
| Flag | Coverage Δ |
|---|---|
| combined | 63.16% <90.65%> (+0.82%) ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.

julienrbrt and others added 7 commits April 14, 2026 15:12
Adds a fibremock package with:
- DA interface (Upload/Download/Listen) matching the fibre gRPC service
- In-memory MockDA implementation with LRU eviction and configurable retention
- Tests covering all paths

Migrated from celestiaorg/x402-risotto#16 as-is for integration.
@julienrbrt julienrbrt changed the title feat(da): support fiber (not via c-node) [DNM] feat(da): support fiber (not via c-node) Apr 20, 2026
julienrbrt and others added 15 commits April 20, 2026 14:46
Adds tools/celestia-node-fiber, a new Go sub-module that implements the
ev-node fiber.DA interface by delegating Upload, Download and Listen to a
celestia-node api/client.Client.

Upload and Download run locally against a Celestia consensus node (gRPC)
and Fibre Storage Providers (Fibre gRPC) — no bridge-node hop — using
celestia-node's self-sufficient client (celestiaorg/celestia-node#4961).
Listen subscribes to blob.Subscribe on a bridge node and forwards only
share-version-2 blobs, which is how Fibre blobs settle on-chain via
MsgPayForFibre.

The package lives in its own go.mod, parallel to tools/local-fiber, so
ev-node core does not inherit celestia-app / cosmos-sdk replace-directive
soup. A FromModules constructor accepts the Fibre and Blob Module
interfaces directly so callers can inject mocks or share an existing
*api/client.Client.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#3280)

* test(celestia-node-fiber): showcase end-to-end Upload/Listen/Download

Adds tools/celestia-node-fiber/testing/, a single-validator in-process
showcase that boots a fibre-tagged Celestia chain + in-process Fibre
server + celestia-node bridge, registers the validator's FSP via
valaddr (with the dns:/// URI scheme the client's gRPC resolver
expects), funds an escrow account, and drives the full adapter
surface.

TestShowcase proves the round-trip: subscribe via Listen, Upload a
blob, wait for the share-version-2 BlobEvent that lands after the
async MsgPayForFibre commits, assert the BlobID from Listen matches
Upload's return, Download and diff the payload bytes.

The harness is intentionally single-validator — a 2-validator
Docker Compose showcase is planned as a follow-up for exercising real
quorum collection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(celestia-node-fiber): scale showcase to 10 blobs, document DataSize gap

Upload 10 distinct-payload blobs through adapter.Upload, collect
BlobEvents via adapter.Listen until every BlobID is accounted for
(order-insensitive, rejects duplicates), then round-trip each blob
through adapter.Download to diff bytes. Catches routing bugs (wrong
blob returned for a BlobID) and duplicate-event bugs that a
single-blob test can't see.

Scaling the test also exposed a semantic issue: the v2 share carries
only (fibre_blob_version + commitment), so b.DataLen() — what
listen.go's fibreBlobToEvent reports today — is always 36, not the
original payload length ev-node's fibermock conveys. The adapter
can't derive the payload size from the subscription stream alone;
surfacing it correctly needs an x/fibre PaymentPromise lookup
(tracked as a TODO on fibreBlobToEvent). The test therefore asserts
DataSize is non-zero rather than matching len(payload).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…3281)

listen.go previously set BlobEvent.DataSize to b.DataLen(), which for
a share-version-2 Fibre blob is always the fixed share-data layout
(fibre_blob_version + commitment = 36 bytes) — not the original
payload length. That diverges from ev-node's fibermock contract and
misleads any consumer that uses DataSize to allocate buffers or
report progress.

The v2 share genuinely doesn't carry the original size, and x/fibre
v8 has no chain query to derive it from the commitment. The only
accurate path is to Download the blob and measure. Listen now does
exactly that before forwarding each event. The cost is one FSP
round-trip per v2 blob; can be made opt-out later if it hurts
throughput-sensitive use cases.

Tests:
- Showcase restores the strict DataSize == len(payload) assertion
  across all 10 blobs.
- Unit test TestListen_FiltersFibreOnlyAndEmitsEvent now stubs
  fakeFibre.Download to return a deterministic payload and asserts
  DataSize matches its length.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ight subscriptions (#3283)

feat(celestia-node-fiber): Listen takes fromHeight for resume subscriptions

Threads a fromHeight parameter through the Fibre DA Listen path so a
subscriber can rejoin the stream from a past block height without
missing blobs. Consumes the matching celestia-node API change landed
in celestiaorg/celestia-node#4962, which gave Blob.Subscribe a
fromHeight argument backed by a WaitForHeight loop.

Changes:

- block/internal/da/fiber/types.go: DA.Listen signature now takes
  fromHeight uint64. fromHeight == 0 preserves "follow from tip"
  semantics, >0 replays from that block forward.
- block/internal/da/fibremock/mock.go: replay matching blobs with
  height >= fromHeight before attaching the live subscriber.
- block/internal/da/fiber_client.go: outer fiberDAClient.Subscribe
  does not yet expose a starting height (datypes.DA doesn't plumb
  one), so pass 0 and defer resume-from-height wiring to a future
  datypes.DA change.
- tools/celestia-node-fiber/listen.go: propagate fromHeight to
  client.Blob.Subscribe on the celestia-node API.
- tools/celestia-node-fiber/go.mod: bump celestia-node to the merged
  pseudo-version (v0.0.0-20260423143400-194cc74ce99c) carrying #4962.
- tools/celestia-node-fiber/adapter_test.go: fakeBlob.subscribeFn
  gets the new fromHeight arg; TestListen_FiltersFibreOnlyAndEmitsEvent
  asserts that fromHeight=0 is forwarded.
- tools/celestia-node-fiber/testing/showcase_test.go: existing
  TestShowcase passes fromHeight=0. New TestShowcaseResume uploads 3
  blobs, discovers their settlement heights via a live Listen, then
  opens a fresh Listen with fromHeight at the first blob's height and
  verifies every historical blob is replayed with correct Height and
  DataSize.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
walldiss and others added 2 commits April 27, 2026 14:15
…imental (#3289)

Picks up the chained celestia-app bump on celestia-node
feature/fibre-experimental, which carries the x/valaddr host:port
validation fix (celestia-app PR #7183).

Cascading changes required by the bump:

- celestia-app v8 → v9 across adapter.go, adapter_test.go, listen.go,
  testing/network.go, testing/bridge.go (the new celestia-node uses
  v9, so the consumer must too).
- testing/network.go drops the `dns:///` prefix from the in-process
  validator registration. The new x/valaddr ValidateBasic enforces
  host:port form, so `dns:///host:port` registrations are now rejected
  at tx time. gRPC's passthrough resolver dials bare `host:port`
  directly with no behavioural difference.

Verified locally:
  go vet -tags fibre ./...                       — clean
  go test -tags fibre -short -run TestShowcase   — pass

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(fiber-bench): single-sequencer ev-node bench against a remote Fibre network

Adds tools/celestia-node-fiber/cmd/fiber-bench, a self-contained binary that
spins up an ev-node aggregator wired to a Fibre network with the bridge node
bypassed, then pumps load into the in-mem mempool to measure throughput end-to-
end. Built specifically to flush out the ev-node-vs-Fibre regression where the
combined stack hits ~1k tps despite Fibre alone delivering ~1.3 GiB/s.

Stripped to keep the measurement clean:
- solo sequencer (no based / no forced inclusion)
- aggregator-only (no syncer, no P2P)
- in-mem core.Executor with constant state root (no state-machine cost)
- bridge-bypass cnfiber.Adapter (Upload via consensus gRPC + FSPs only)
- direct InjectTx (no HTTP overhead)

Includes:
- keyring management (test backend, test-only convenience for the bench account)
- Fibre escrow deposit/query helpers so the bench is self-contained
- per-Upload latency instrumentation (p50/p99/mean/max) so we can split
  Fibre-side latency from ev-node submitter serialization
- live periodic stats (tps + MB/s for inj/exec/da_settled streams) and a
  baseline summary at end of run

Build with -tags fibre — without it the celestia-app x/fibre messages aren't
registered in the codec and async pay-for-fibre settlement fails with "unable
to resolve type URL /celestia.fibre.v1.MsgPayForFibre".

* feat(common): default MaxBlobSize to Fibre's actual cap (128 MiB - 5 B)

The 5 MB default left ~25x of Fibre's per-blob capacity unused: Fibre's
MaxBlobSize is 1 << 27 bytes (128 MiB) and the protocol's per-blob header
is 5 bytes (1 byte version + 4 byte data size, see celestia-app/v9/fibre/
blob.go::blobHeaderLen and protocol_params.go::MaxBlobSize). Anchoring
ev-node's default to the actual cap lets each block carry the full
~128 MiB of payload, multiplying settlement throughput at the same
per-Upload latency.

Also drops the bench's executor.FilterTxs overhead margin: the cap
already lives at the right level (Fibre's MaxBlobSize), and reserving
extra in the executor would just leave bandwidth on the table again. If
proto/metadata overhead pushes a marshaled block over the cap, that
should be addressed in ev-node's block producer rather than worked
around in test fixtures.

The link-time override is kept for callers that want to constrain the
default further (smaller cap → smaller blocks → lower per-Upload
latency for environments where that matters).

* fix(block/executing): reserve proto/metadata overhead in RetrieveBatch's MaxBytes

The block producer was passing MaxBytes = MaxBlobSize directly to
GetNextBatch, but the marshaled types.Data (txs + Metadata + proto
framing) is larger than the sum of raw tx bytes. The per-tx proto
length-prefix is ~3 bytes, which is small in absolute terms but adds up
to 1.5% overhead at typical 200 B txs and over 1 MB of overhead at
peak block sizes (128 MiB). Without reserving this margin, a fully
packed batch builds a block that exceeds the submitter's MaxBlobSize
check and halts as 'unrecoverable: single item exceeds DA blob size
limit'.

Reserving in the block producer (rather than in FilterTxs) keeps the
executor's view of MaxBytes equal to the raw-tx budget, which is what
FilterTxs is meant to enforce.

* fix(reaper,cache): make seen-tx retention link-time tunable to avoid OOM under load

The seen-tx cache holds a SHA-256 hash for every transaction the reaper
ever drained. With CleanupInterval = 1h and DefaultTxCacheRetention =
24h hardcoded as consts, sustained throughput causes the map to grow
linearly without ever shrinking until the GC pressure or process
memory caps the run. Observed empirically while benchmarking the Fibre
DA path: at ~1.5M tx/s the bench OOM-killed after ~80 s with ~16 GB
RSS, the cache holding ~120 M entries.

Changing both to vars driven by ldflags lets ev-node keep its
production-friendly defaults (memory-cheap dedup over a 24 h window,
swept once an hour) while letting benchmark builds opt into shorter
windows so the cache reaches a steady state. Example for the
fiber-bench tool:

  go build -ldflags "\
    -X github.com/evstack/ev-node/block/internal/cache.defaultTxCacheRetentionStr=30s \
    -X github.com/evstack/ev-node/block/internal/reaping.cleanupIntervalStr=5s"

A real fix probably reaches further (cap entry count, switch to a TTL
cache implementation, or bypass dedup when the caller already
guarantees uniqueness) but these are larger conversations; the
ldflag knob unblocks measurement in the meantime.

* fix(fiber-bench/loader): backoff with sleep when mempool is full

The loader's drop path was runtime.Gosched + immediate retry, which lets
each worker allocate a fresh 200 B tx slice at ~200k iter/s when the
executor's mempool channel is permanently full. With --workers=8 that
is 1.6 M short-lived allocations/s = ~320 MB/s of GC churn, against
nothing useful — the rejected slices never make it into a block.

Sleeping 100 us on a failed InjectTx caps the per-worker drop rate at
~10k/s and makes total allocation pressure scale with --workers as a
proportional backpressure signal rather than a constant maximum-rate
spin. Drops in the live stats line still grow visibly when the mempool
is full, just at a sane rate.

Without this fix the bench OOM-killed under sustained load even with
--max-pending=4 throttling block production: pending blob memory was
bounded but GC could not keep up with the loader's allocation rate
fast enough to prevent runaway heap growth alongside Badger's L0
backlog and ev-node's pending caches.

* fix(common): make defaultMaxBlobSizeStr a string literal so -ldflags -X works

A previous version initialised the variable via strconv.FormatUint(...),
which Go's linker treats as a non-constant expression — so -ldflags -X
silently no-ops the override. Every benchmark that tried to set a
smaller MaxBlobSize at link time was actually running with the 128 MiB
default, masking what we were measuring.

The correct form is a plain string literal in the source. The Fibre
cap is documented in the comment so the magic number stays
self-explanatory; init() still parses and falls back to the literal
value if parsing fails.

* docs: TODO(throughput-cleanup) on the DA-blob-vs-raw-tx-budget conflation

common.DefaultMaxBlobSize is plugged into two semantically different
limits — the raw-tx budget that gates FilterTxs and the marshaled
ceiling that gates submitter retries — and the conflation has been the
root cause of more than one bug while debugging Fibre throughput
(packed blocks marshaling larger than MaxBlobSize, ad-hoc 2%
reservations in RetrieveBatch, etc.). File three TODOs pointing at
each other and at the umbrella note in common/consts.go so the next
person picking this up can do the cleanup atomically rather than
adding more workarounds.

No behavioral change.

* perf(fiber-da): skip flatten allocation on single-item Submit; honor ctx

Three changes in fiber_client.go::Submit, all hot-path correctness/
efficiency wins surfaced while debugging Fibre throughput:

1. Single-item fast path that bypasses flattenBlobs.
   For data blobs, limitBatchBySize already caps each Submit call at
   one item (each block's data already saturates MaxBlobBytes). The
   flatten step was therefore allocating MaxBlobSize bytes and
   memcpy'ing the entire payload solely to prepend the 8-byte
   count/length prefix used by splitBlobs. At 128 MiB blocks that's
   ~128 MB held in two places at once during every Upload. The fast
   path passes data[0] straight through and saves the full copy.

   Wire-format caveat: a retriever (full-node syncer or light client)
   downloading a blob written via this fast path can't decode it —
   splitBlobs always expects the prefix. The right fix is to pair
   this with a per-item Upload model so flatten falls away entirely;
   tracked as a TODO in the source pointing at the concurrent-uploads
   work where that lands naturally.

2. Honor caller's ctx in Upload.
   The previous context.Background() kept Uploads alive past node
   shutdown and was the proximate cause of the "payment promise
   already processed" warnings — a stale Upload would settle on-chain
   after ev-node had already moved on. Threading the caller's ctx
   makes shutdown promptly cancel in-flight Uploads.

3. Correct SubmittedCount on error.
   On a full-Upload failure the result reported len(data)-1 as
   submitted, which both reads weirdly for len==1 (uint64 underflow
   risk in any future arithmetic) and lies to submitToDA's
   prefix-of-success retry advance. Reset to 0 on error.

No behaviour change for the multi-item retrieve path (flatten still
runs when len > 1). Validated via go build / go vet.

* perf(fiber-da): per-item concurrent Uploads on Submit

Fan out one goroutine per item in fiber DA Submit, calling fiber.Upload
concurrently with the caller's ctx. Settlement throughput now scales
linearly with the batch size: previously ev-node's submitter could
only have one Upload in flight per stream (header + data, mutex-locked
in submitter.go), and each Submit further serialized the batch into
one big flatten-encoded blob. With fan-out, a Submit of N items
becomes N concurrent Uploads, and Fibre's ~1.5 s per-Upload latency
amortizes across N.

The result-aggregation honors submitToDA's "prefix of successes"
contract: SubmittedCount = N means items [0..N) succeeded and the
caller will retry [N..end). Reporting interleaved successes would
double-submit blobs and waste escrow; matching prefix semantics keeps
the retry contract intact even when individual Uploads fail
out-of-order.

Pair changes in submitting/da_submitter.go:
- limitBatchBySize gains a maxItems cap (was total-bytes-only). Each
  item is still bounded by maxItemBytes (chain ceiling), but the
  total batch is now bounded by item count, letting multiple
  full-size items flow through one Submit.
- retryPolicy adds MaxItems with a sensible non-fiber default of 1
  (preserves legacy single-item-per-Submit semantics for backends
  that flatten a batch into one blob).
- For the fiber backend, MaxItems is bumped to 16 — covers a 5 min
  run at 1 b/s production with 4–8 pending blocks while leaving
  headroom for memory pressure under MaxBlobSize-sized items.

Wire-format follow-up (see TODO in fiber_client.go::Submit): the
retrieve path in this file still uses splitBlobs which assumes the
old single-prefixed-blob format. Per-item Uploads now produce raw
blobs with their own BlobIDs; retrieve needs an update to read each
BlobID separately. The bench's aggregator-only setup never invokes
retrieve so this is unblocked for measurement but blocks merging to
production until addressed.

* perf(fiber-bench): use in-memory KV store, not disk-backed Badger

Block production calls store.batch.Commit() synchronously inside
ProduceBlock — which means Badger's write throughput is a hard ceiling
on block production rate. At 128 MB blocks × ~1 b/s the on-disk
backend generates ~150 MB/s of value-log writes plus heavy compaction
churn that backed up under load: vlog files filled (~1.2 GB each)
faster than Badger could rotate, and we hit a "file exists" race on
.vlog rotation that wedged the producer entirely.

The bench has no durability requirement — if it crashes we re-run —
so swap to NewTestInMemoryKVStore. ev-node's code path is unchanged
(same Batch / Commit semantics), the data just lives in a map. This
removes Badger from the critical path and lets the bench measure
ev-node's actual pipeline rather than Badger's write-amplification
curve.

Open question for production fiber rollups: since Fibre IS the
storage (a fiber-only node can re-sync any block from the chain),
does ev-node need to persist block data to local Badger at all?
Possibly worth a fiber-only-skip-block-store mode in the executor,
analogous to how the !fiber broadcast paths are gated. Filed
informally; not blocking the throughput investigation.

* fix(fiber-bench): use ds.MapDatastore, not Badger in-memory

Previous in-memory switch used store.NewTestInMemoryKVStore() which is
backed by Badger with WithInMemory(true). That mode still enforces
Badger's default 1 MiB ValueThreshold, so any block larger than 1 MiB
fails to save with:

  Value with size 133506229 exceeded 1048576 limit

Our 128 MiB blocks blow past this on every commit. Symptom in the
logs is a stream of 'failed to save block data' errors while the
submitter continues to upload pending items from cache — so settlement
keeps advancing for already-cached items but new block production
halts.

Swap to ds.MutexWrap(ds.NewMapDatastore()): a pure-Go in-memory map
with no per-value size limit, thread-safe via the standard sync
wrapper. Same Batch / Commit semantics ev-node expects, just a thin
sync.Mutex around a Go map.

The bench has no durability requirement — the Badger reference is
kept aliased above the assignment so the dependency stays imported
in case we want to switch back via flag later.

* hack(store): swap NewDefaultKVStore to in-memory MapDatastore

Block production calls store.batch.Commit() synchronously inside
ProduceBlock, so storage write throughput is a hard ceiling on block
production. With 128 MiB blocks × ~1 b/s the on-disk Badger backend
generates ~150 MB/s of value-log writes plus heavy compaction; under
sustained load we hit a Badger .vlog rotation race ("file exists")
that wedges the producer entirely.

Returning a sync-wrapped MapDatastore from the canonical constructor
(rather than special-casing the bench) puts the change exactly where
ev-node loads its store, makes the diff small and obvious, and lets
the bench drop its private MapDS swap to call NewDefaultKVStore the
same way every other ev-node binary does.

The HACK comment names three real fixes — async commit, fiber-only
skip-persistence, write-optimised backend — so this isn't read as
"revert to Badger before merge". NewDefaultKVStoreOnDisk preserved as
the literal Badger constructor for any caller that explicitly wants
disk-backed state today.

Reverts the bench-side workaround introduced in 7ed0bf1.

* hack(reaper,cache): collapse seen-tx TTL plumbing back to plain consts

Previous fix (ecd7f62) made DefaultTxCacheRetention and CleanupInterval
ldflag-overridable so the bench could shrink them at link time. That
hid the actual change behind 30 lines of init() / parsing scaffolding —
the diff said "add tunable" but the operational story was "the default
is wrong for any meaningful TPS". Replacing the plumbing with two
const edits puts the hack where it belongs, where the value lives.

DefaultTxCacheRetention: 24h -> 30s.
At ~1.5M tx/s sustained the 24h dedup window grows the cache to ~16 GB
in under a minute (each entry is the SHA-256 hex string, ~150 B in map
representation), which OOM-kills the bench before any throughput
signal is visible. The HACK comment flags 24h as itself wrong:
retention-by-wall-time scales poorly with TPS. The proper fix is an
LRU-by-count cache, or expressing the window in DA blocks (mempool
TTL × DA block time), not a fixed duration.

CleanupInterval: 1h -> 5s.
Coupled to the previous 24h retention; an hourly sweep against a 24h
window means entries can outlive expiry by 1h (fine when retention is
days, completely broken at 30s retention where entries would survive
12× past expiry). The HACK comment notes this should derive from
retention rather than be a separate fixed value.

Reverts the link-time tunability scaffolding from ecd7f62. The bench
no longer needs ldflags for these — same hack with the standard
build.

* docs: surface follow-up issues left by the throughput hacks

Three small comment / dead-code edits. None change behaviour; they
make hidden assumptions visible so the next person reading the diff
doesn't trip on them.

block/internal/common/consts.go
  DefaultMaxBlobSize: flag that the new 128 MiB-5 default is correct
  for fiber-enabled deployments but WRONG for the legacy JSON-RPC
  blob client path — bridge / chain reject blobs above their own
  much smaller cap. The right shape is per-backend caps; the global
  default was always going to be a leaky abstraction.

block/internal/da/fiber_client.go
  Remove flattenBlobs (dead code now that Submit fans out per item).
  Keep splitBlobs but document loudly that it can no longer decode
  blobs THIS branch's Submit writes — the per-item Upload path
  produces raw blobs while splitBlobs expects the legacy "count +
  per-item length" framing. Retrieve / Get / Subscribe callers in
  the same file are therefore broken for our writes; the comment
  points at the wire-format follow-up that has to land before any
  node on this branch tries to sync from another.

block/internal/submitting/da_submitter.go
  fiberDefaultBatchItems = 16: flag the magic number as needing a
  config knob (FiberDAConfig.UploadConcurrency was scaffolded for
  exactly this earlier and reverted; wire it through here when the
  concurrent-uploads change graduates from prototype). 16 is a
  pragmatic measurement default, not a considered production value.

* refactor(fiber-bench): delegate node wiring to rollcmd.StartNode

The bench was hand-rolling the same node wiring testapp/evm/grpc
apps already do via pkg/cmd.StartNode — DA client construction, p2p
client setup, node.NewNode call, signal handling, the run loop. Each
of those grew its own way of doing things in the bench, drifted from
the canonical path, and left a maintenance gap if cmd.StartNode ever
gained a new responsibility (which is exactly how the fiberClient
parameter regression on this branch happened — testapp was never
updated to pass it).

Replace the inline wiring with one rollcmd.StartNode call. The bench
now owns only what's genuinely bench-specific:

  - Cosmos keyring open + bridge-bypass cnfiber.Adapter (no
    production equivalent — bypasses bridge node dialing)
  - Block-signing key created in homedir, passphrase written to a
    temp file so StartNode can read it through its standard flag
  - inMemExecutor + solo sequencer (constant state root for
    measurement; testapp's KVExecutor recomputes state by scanning
    every key, O(N) per block)
  - Loader + stats printer goroutines spawned before the blocking
    StartNode call; SIGINT-to-self triggers shutdown when the
    duration timer expires (StartNode's outer select waits on
    signal/err only — not ctx — so this is the contained way to
    drive duration through its existing shutdown path).

Net diff: ~30 LOC fewer, but the meaningful change is that the bench
is no longer carrying its own copy of testapp/evm/grpc's node setup.
The bridge-bypass adapter, instrumented Upload latency proxy, escrow
helpers, and stats printer remain (those don't duplicate canonical
ev-node code; they exist only for measurement and operator UX).

Filing for follow-up: testapp/evm/grpc apps still don't compile on
this branch because cmd.StartNode gained the fiberClient parameter
without updating its callers. The right fix is one of:
  - testapp/cmd/run.go imports tools/celestia-node-fiber and wires
    cnfiber.New (with bridge) when nodeConfig.DA.Fiber.Enabled.
  - Or cmd.StartNode grows a constructor-style overload so callers
    that don't use Fiber can keep their old signature.
Either way, that's a separate piece of work; this commit just
demonstrates the canonical pattern from the bench side.

* fix(apps): unblock testapp/evm/grpc compile by passing nil fiberClient

The fiberClient parameter was added to pkg/cmd.StartNode in commit
87573ae (on this branch's parent julien/fiber) but the three apps
that call it were never updated. Branch HEAD therefore had three
broken compiles — anyone trying to build a testapp / evm / grpc
binary on this branch hit:

  cmd/run.go: not enough arguments in call to cmd.StartNode

Pass nil for the new parameter in each app and document why with a
TODO pointing at tools/celestia-node-fiber. None of the three apps
currently need fiber DA support — they pre-date this branch's fiber
work — and the right way to add it is to construct a
*cnfiber.Adapter from nodeConfig.DA.Fiber and pass it through, the
same pattern fiber-bench's run.go uses (see commit 57fa859). That
work is out of scope for this commit; this is just the "stop the
bleed" change so the branch builds cleanly.

Three identical comment blocks across the three apps so anyone
landing in any one of them sees the same context.

* refactor(fiber-bench): reuse canonical config flags via rollconf.AddFlags

The bench's runFlags struct had grown ~22 cobra flags, ~15 of which
were straight aliases for things rollconf.AddFlags already registers
(--block-time → --evnode.node.block_time, --batching-strategy →
--evnode.da.batching_strategy, --consensus-grpc →
--evnode.da.fiber.consensus_address, etc.). Each alias was its own
maintenance liability — defaults drifted from the canonical defaults,
new ev-node config fields didn't surface here without manual sync, and
operators learned a bench-specific flag dialect that didn't transfer
to testapp/evm/grpc.

Drop the aliases. Run command now calls:

  rollconf.AddGlobalFlags(root, AppName + "/node")  // --home, --evnode.log.*
  rollconf.AddFlags(runCmd)                         // --evnode.node.*, etc.
  rollcmd.ParseConfig(cmd) → rollcmd.SetupLogger(cfg.Log)

…then post-parse forces what the bench requires (Aggregator, Fiber.Enabled,
P2P.ListenAddress, Signer.SignerType, Pprof off, Prometheus on, BridgeAddress
placeholder for FiberDAConfig.Validate) and overrides canonical defaults
that are wrong for a throughput bench (DA block time → 1s, batching →
immediate, scrape interval → 100ms, namespaces → fb-bench-{h,d}). Operator
flags always win — overrides only fire when cobra reports the flag wasn't
Changed.

Bench-local flags that survived: --duration, --workers, --tx-size,
--mempool-size, --stats-interval, --keep-home, --keyring-dir (cosmos
keyring; not the ev-node signer), --signer-passphrase (still writes a
temp file consumed by --evnode.signer.passphrase_file; commit 2 will
replace this with a real init flow).

Default home stays at ~/.fiber-bench/node (passed as
\"fiber-bench/node\" to AddGlobalFlags) so the os.RemoveAll(cfg.RootDir)
on --keep-home=false runs cannot clobber the cosmos keyring at
~/.fiber-bench/keyring. Updated run-bench.sh and README to use the
canonical --evnode.* flag names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(fiber-bench): inline loader backoff, drop yield.go

yield.go was a single-line wrapper around time.Sleep(100us) parked in
its own file with a long explanatory comment. The comment moves up to
the loaderBackoff const in loader.go (the only caller), the file goes
away. No behavioural change.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
